FPGA-based convolutional neural network fixed-point acceleration
LEI Xiaokang, YIN Zhigang, ZHAO Ruilian
Journal of Computer Applications    2020, 40 (10): 2811-2816.   DOI: 10.11772/j.issn.1001-9081.2020020256
To address the high power consumption and slow execution of Convolutional Neural Networks (CNNs) on resource-constrained hardware, a method for accelerating fixed-point CNN computation on a Field Programmable Gate Array (FPGA) was proposed. First, a fixed-point processing method was proposed: to reduce the storage space of the CNN parameters, different scale parameters were designed for different convolution layers, the relative divergence was used to determine the bit width, and the effect of different quantization intervals on CNN accuracy was studied. Then, a parameter multiplexing method and a pipelined calculation method were designed to accelerate the convolution calculation. The acceleration effect after fixed-point processing was verified on two datasets, one of faces and one of ships. Compared with traditional floating-point convolution, with only a small loss of CNN accuracy and with the weight parameters and input feature map parameters quantized to 7 bits, the proposed method compressed the weight parameter file of the face recognition CNN model to about 22% of its original size and achieved a convolution speedup of 18.69 times. The method also raised the utilization of the multiplier-accumulators in the FPGA to 94.5%. Experimental results show that the proposed method improves the speed of convolution calculation and uses FPGA hardware resources efficiently.
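The per-layer scaling idea in this abstract can be illustrated with a minimal sketch. It assumes symmetric signed quantization with a scale taken from the largest absolute value in the layer; the abstract itself selects the bit width with a relative-divergence criterion, which is not reproduced here. The names `quantize_layer` and `dequantize` are hypothetical.

```python
def quantize_layer(values, bits=7):
    """Quantize one layer's values to signed fixed-point codes.

    Assumption: the scale is derived from the layer's peak magnitude.
    The paper's divergence-based bit-width selection is not modeled.
    """
    qmax = 2 ** (bits - 1) - 1              # 63 for signed 7-bit codes
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


weights = [0.50, -0.25, 0.125, -0.63]       # illustrative values only
codes, scale = quantize_layer(weights)
restored = dequantize(codes, scale)
```

Each layer keeps its own `scale`, so a layer with small weights is not forced to share the range of a layer with large ones; the rounding error per value stays within half a quantization step.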
Whole process optimized garbage collection for solid-state drives
FANG Caihua, LIU Jingning, TONG Wei, GAO Yang, LEI Xia, JIANG Yu
Journal of Computer Applications    2017, 37 (5): 1257-1262.   DOI: 10.11772/j.issn.1001-9081.2017.05.1257
Due to NAND flash's inherent restrictions, such as erase-before-write and a large erase unit, flash-based Solid-State Drives (SSDs) require Garbage Collection (GC) operations to reclaim invalid physical pages. However, the high overhead of garbage collection significantly decreases the performance and lifetime of SSDs, and the degradation becomes more severe when the data fragments on the SSD are frequently used. Existing GC algorithms focus on only some steps of the garbage collection operation, and none of them provides a comprehensive solution that takes all the steps of the GC process into consideration. Based on a detailed analysis of the GC process, a whole process optimized garbage collection algorithm named WPO-GC (Whole Process Optimized Garbage Collection) was proposed, which integrates optimizations for each step of GC in order to reduce, to the greatest extent, the negative impact on normal read/write requests and on the SSD's lifetime. Moreover, WPO-GC was implemented on SSDsim, an open-source SSD simulator, to evaluate its efficiency. The experimental results show that, compared with a typical GC algorithm, the proposed algorithm decreases read I/O response time by 20%-40% and write I/O response time by 17%-40%, and improves wear balance by nearly 30%, which extends the lifetime.
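For readers unfamiliar with the GC steps mentioned above (victim selection, valid-page migration, block erase), the following is a minimal sketch of a generic greedy baseline. It is not WPO-GC, whose optimizations the abstract does not detail; the page-state strings and the function names `pick_victim` and `collect` are hypothetical.

```python
def pick_victim(blocks):
    """Greedy policy: choose the block holding the most invalid pages."""
    return max(range(len(blocks)), key=lambda i: blocks[i].count("invalid"))


def collect(blocks, victim):
    """Erase the victim block and report the migration cost.

    Returns the number of valid pages that would have to be copied
    elsewhere before the erase; the copy itself is not simulated.
    """
    cost = blocks[victim].count("valid")
    blocks[victim] = ["free"] * len(blocks[victim])   # whole-block erase
    return cost


blocks = [
    ["valid", "invalid", "invalid", "valid"],
    ["invalid", "invalid", "invalid", "valid"],
    ["valid", "valid", "free", "free"],
]
victim = pick_victim(blocks)
cost = collect(blocks, victim)
```

The migration cost is what delays normal read/write requests during GC, and every erase consumes part of a block's limited endurance, which is why both response time and wear balance appear in the reported results.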
Fast segmentation of sign language video based on cellular neural network
ZHANG Aihua, LEI Xiaoya, CHEN Xiaolei, CHEN Lili
Journal of Computer Applications    2013, 33 (02): 503-506.   DOI: 10.3724/SP.J.1087.2013.00503
To enable region-of-interest coding of sign language video and improve call efficiency, a fast segmentation method for sign language video based on the Cellular Neural Network (CNN) was proposed. First, the skin regions of the sign language video were detected with the corresponding CNN templates, using skin color information. Second, CNN-based motion detection with an inter-frame difference algorithm was carried out on the skin detection results to obtain the initial gesture region. Finally, morphological processing was employed to fill small holes and smooth region boundaries, yielding the segmentation of the face and hand regions of the sign language video image sequence. The results show that the method can segment sign language video rapidly and accurately.
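The second step, inter-frame differencing restricted to detected skin pixels, can be sketched as follows. This is plain per-pixel thresholding on nested lists, not the cellular neural network templates the abstract uses; the function name `motion_mask` and the threshold value are assumptions.

```python
def motion_mask(prev, curr, skin, threshold=20):
    """Mark pixels that are skin-colored AND changed between two frames.

    prev, curr: grayscale frames as lists of rows of intensities.
    skin: binary mask of the same shape (1 = skin pixel).
    """
    return [
        [1 if s and abs(c - p) > threshold else 0
         for p, c, s in zip(prev_row, curr_row, skin_row)]
        for prev_row, curr_row, skin_row in zip(prev, curr, skin)
    ]


prev = [[10, 10, 10], [10, 10, 10]]
curr = [[10, 90, 10], [80, 10, 10]]
skin = [[1, 1, 0], [0, 1, 1]]
mask = motion_mask(prev, curr, skin)   # → [[0, 1, 0], [0, 0, 0]]
```

The pixel at row 1, column 0 changes strongly but is not skin, so it is excluded; this is how the skin result narrows motion detection to candidate gesture regions before morphological cleanup.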
Split-iterated algorithm for 2D bar code encoding of the GB18030-character-set-based Chinese ideograms
Chun-Lei Xia, Shu-Guang Dai, Ren-Jie Zhang
Journal of Computer Applications   
For 2D bar codes to be widely used in certificates, it is important to encode Chinese ideograms that are rarely seen in everyday life, since more and more people choose such ideograms for their names to make them unique. The GB18030-2005 character set, which includes nearly 27,000 characters, makes it possible to solve this problem. The encoding rules for Chinese ideograms in GB18030 were analyzed, and a PDF417 2D bar code encoding method for mixed Chinese ideograms and ASCII characters was put forward. A special byte-compression method named the split-iterated algorithm, which saves memory without loss of encoding efficiency and can be implemented with 16-bit programming tools, was also introduced. An example was given to illustrate the whole encoding process.
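The variable-length layout of GB18030 that such an encoder must handle can be seen with Python's built-in `gb18030` codec. This sketch only reports how many bytes each character occupies; it does not implement PDF417 compaction or the split-iterated algorithm, and the function name `gb18030_lengths` is hypothetical.

```python
def gb18030_lengths(text):
    """Return the GB18030 byte length of each character in text."""
    return [len(ch.encode("gb18030")) for ch in text]


# "A" is ASCII, U+4E2D is a common ideograph, U+3400 is a rare
# ideograph from CJK Extension A.
sample = "A\u4e2d\u3400"
lengths = gb18030_lengths(sample)   # → [1, 2, 4]
```

Mixed text therefore yields a stream of one-, two-, and four-byte units, which is the input a byte-compression stage for the bar code has to work with.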